Abstract
Introduction: Diffuse large B-cell lymphoma (DLBCL) is the most common non-Hodgkin lymphoma globally. In Latin America (LATAM), limited access to molecular tools and advanced imaging hampers risk stratification. Inflammatory and nutritional biomarkers such as albumin, hemoglobin, lymphocytes, and derived indices (NLR, PLR, LMR, HALP) are increasingly recognized as prognostic tools, but their use to define phenotypic subgroups remains understudied. We aimed to identify immunoinflammatory phenotypes using machine learning and assess their association with survival outcomes in a LATAM cohort.
Methods: We conducted a multicenter, retrospective cohort study using data from the GELL registry (2013–2024), which includes adult patients with DLBCL who received first-line treatment with R-CHOP and were managed at academic centers across LATAM. Patients with incomplete baseline or follow-up data, or those who could not be contacted for outcome verification, were excluded. We calculated inflammatory indices at diagnosis: neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), lymphocyte-to-monocyte ratio (LMR), and the HALP index (Hemoglobin × Albumin × Lymphocytes / Platelets). The primary outcome was overall survival (OS). Variables were standardized prior to unsupervised clustering using k-means. The dataset was randomly divided into a 70% training set and a 30% validation set. Survival analyses were performed using Kaplan–Meier estimators and multivariable Cox proportional hazards models. To facilitate bedside application, a Shiny web application was developed based on the validated Random Forest classifier.
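The index definitions above can be illustrated with a minimal Python sketch. It is not the registry's actual pipeline: the function names are hypothetical, and the example patient is invented for illustration. The sketch assumes hemoglobin and albumin in g/dL and all cell counts in the same unit (cells/mm³). The z-score function is one common way to standardize variables before k-means, and is likewise an assumption.

```python
from statistics import mean, pstdev


def inflammatory_indices(neutrophils, lymphocytes, monocytes, platelets,
                         hemoglobin_g_dl, albumin_g_dl):
    """Return NLR, PLR, LMR and HALP for one patient.

    Assumption: all cell counts share one unit (cells/mm^3 here), so the
    ratios are unitless; hemoglobin and albumin are given in g/dL.
    """
    if min(lymphocytes, monocytes, platelets) <= 0:
        raise ValueError("lymphocytes, monocytes and platelets must be > 0")
    return {
        "NLR": neutrophils / lymphocytes,
        "PLR": platelets / lymphocytes,
        "LMR": lymphocytes / monocytes,
        # HALP = Hemoglobin x Albumin x Lymphocytes / Platelets
        "HALP": hemoglobin_g_dl * albumin_g_dl * lymphocytes / platelets,
    }


def standardize(values):
    """Z-score a list of values (population SD), as a pre-clustering step."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sd for v in values]


# Hypothetical patient (illustrative values only, not registry data).
example = inflammatory_indices(
    neutrophils=4480, lymphocytes=1600, monocytes=533, platelets=267200,
    hemoglobin_g_dl=13.6, albumin_g_dl=3.9,
)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Because HALP and PLR share the lymphocyte and platelet terms, HALP equals Hemoglobin × Albumin / PLR for any single patient. This makes it useful to standardize the indices before clustering, since they are on very different scales.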
Results: A total of 1,470 patients with newly diagnosed DLBCL were screened, of whom 1,077 met the inclusion criteria (median age: 64 years; 50% male). Forty percent had Ann Arbor stage IV disease, 48% had an ECOG performance status of 1, and 38% were classified into the high or high-intermediate IPI risk groups. The median follow-up was 51 months (IQR: 38–66). Two immunoinflammatory clusters were identified: Cluster 1 (High-Risk Immunoinflammatory Phenotype) and Cluster 2 (Favorable Immunoinflammatory Phenotype).
Training cohort (n = 754):
Cluster 1 (n = 245) exhibited a high-risk immunoinflammatory phenotype: lower albumin (3.10 vs. 3.90 g/dL), hemoglobin (10.4 vs. 13.6 g/dL), and lymphocyte counts (858 vs. 1,600 cells/mm³), higher NLR (5.90 vs. 2.80) and PLR (341 vs. 167), and lower HALP (0.09 vs. 0.32) and LMR (1.67 vs. 3.00) (all p < 0.001). Mortality was significantly higher in Cluster 1 (51%) compared to Cluster 2 (31%) (p < 0.001). In multivariable Cox regression, Cluster 2 was independently associated with lower mortality risk (HR: 0.62, 95% CI: 0.48–0.79, p < 0.001).
Validation cohort (n = 323):
The Random Forest model retained predictive performance, consistently identifying patients with the adverse immunoinflammatory profile. Predicted Cluster 1 (n = 112) retained the high-risk phenotype and demonstrated worse OS than Cluster 2 (n = 211) (p < 0.0001). Mortality was 53% in Cluster 1 vs. 30% in Cluster 2 (p < 0.001). In multivariable analysis, Cluster 2 remained independently associated with reduced mortality (HR: 0.67, 95% CI: 0.45–0.996, p = 0.051), supporting external validity of the model.
Web Application:
A user-friendly Shiny web app was developed: https://rafaelpichardo.shinyapps.io/GELL-ImmuneDLBCL/
Conclusions: Routine inflammatory and nutritional biomarkers can identify reproducible immunoinflammatory phenotypes with distinct prognoses in DLBCL. Patients within the adverse cluster had significantly inferior OS in both cohorts, independent of IPI. This low-cost approach may improve risk stratification in resource-limited settings. The Shiny app enables bedside application and supports clinical decision-making.